A Web Scraping Methodology for Bypassing Twitter API Restrictions

Authors

  • A. Hernandez-Suarez
  • G. Sanchez-Perez
  • K. Toscano-Medina
  • V. Martinez-Hernandez
  • V. Sanchez
  • H. Perez-Meana
Abstract

Retrieving information from social networks is the first and most fundamental step in many data analysis fields, such as Natural Language Processing, Sentiment Analysis and Machine Learning. Important data science tasks rely on historical data gathering for further predictive results. Most recent works use the Twitter API, a public platform for collecting public streams of information, which only allows querying chronological tweets that are no more than three weeks old. In this paper, we present a new methodology for collecting historical tweets within any date range using web scraping techniques that bypass Twitter API restrictions.

Similar resources

Web scraping technologies in an API world

Web services are the de facto standard in biomedical data integration. However, there are data integration scenarios that cannot be fully covered by Web services. A number of Web databases and tools do not support Web services, and existing Web services do not cover all possible user data demands. As a consequence, Web data scraping, one of the oldest techniques for extracting Web contents,...

A Web Recommendation System based on Individual Preference Estimated from Twitter

This paper proposes a web recommendation system that estimates and dynamically updates individual preference using Twitter, in order to reduce web search effort. The proposed system gathers personal comments on Twitter, extracts object-predicate pairs by text analysis, and ranks the objects by weighting the paired predicates in accordance with a prepared predicate-point dictionary such as “l...

A Supervised Approach To Musical Chord Recognition

In this paper, we present a prototype of an online tool for real-time chord recognition, leveraging the capabilities of new web technologies such as the Web Audio API, and WebSockets. We use a Hidden Markov Model in conjunction with Gaussian Discriminant Analysis for the classification task. Unlike approaches to collect data through web-scraping or training on hand-labeled song data, we generat...

WAAX: Web Audio API eXtension

The introduction of the Web Audio API in 2011 marked a significant advance for web-based music systems by enabling real-time sound synthesis in web browsers simply by writing JavaScript code. While this powerful functionality has arrived, there is an as yet unaddressed need for an extension to the API to fully reveal its potential. To meet this need, a JavaScript library dubbed WAAX was created to f...

Programmatic Interfaces for Web Applications

eBay’s launch of its API in November 2000 marked the beginning of an era in which Web applications offer services for third-party application integration. The rapid growth of programmatic interfaces for Web applications has recently revolutionized online content integration and created new opportunities for vendors to build developer ecosystems. According to ProgrammableWeb, a leading service ...

Publication date: 2018